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Abstract 

In the context of change-point detection, addressed by Total Variation minimiza¬ 
tion strategies, an efficient on-the-fly algorithm has been designed leading to exact 
solutions for univariate data. In this contribution, an extension of such an on-the-fly 
strategy to multivariate data is investigated. The proposed algorithm relies on the local 
validation of the Karush-Kuhn-Tucker conditions on the dual problem. Showing that 
the non-local nature of the multivariate setting precludes to obtain an exact on-the-fly 
solution, we devise an on-the-fly algorithm delivering an approximate solution, whose 
quality is controlled by a practitioner-tunable parameter, acting as a trade-off between 
quality and computational cost. Performance assessment shows that high quality so¬ 
lutions are obtained on-the-fly while benefiting of computational costs several orders 
of magnitude lower than standard iterative procedures. The proposed algorithm thus 
provides practitioners with an efficient multivariate change-point detection on-the-fly 
procedure. 
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1 Introduction 


Total Variation (TV) has been involved in a variety of signal processing problems, such as 
nonparametric function estimation mils] or signal denoising [niiniisa. The first contri¬ 
butions on this subject were formulated within the framework of taut string theory [HHS] 
while the term TV had first been introduced in image restoration 01301 ■ The equivalence 
between both formalisms has been clarified in m 

Formally, the univariate TV framework aims at finding a piece-wise constant estimate 
X G of a noisy univariate signal y G by solving the following non-smooth convex 
optimization problem, 


x= argmin ^\\x - y\f + \xk+i - Xk\, (1) 

where A > 0 denotes a regularization parameter balancing data fidelity versus the piece-wise 
nature of the solution. 

Related works: recent developments and issues. It is well known and docu¬ 
mented that the unique solution of the optimization problem © can be reached by iter¬ 
ative fixed-point algorithms. On the one hand, solving this problem in the primal space 
requires to deal with the non-differentiability of the £i-norm that is either handled by 
adding a small additional smoothing parameter |33) or by considering proximal algorithms 
[?.[TH3l[5l [TnUI^[27ll291l31] . On the other hand, one can make use of the Fenchel-Rockafellar 
dual formulation or Lagrangian duality [331I33] that can be solved with quadratic 

programming techniques [41131]. Both primal and dual solutions suffer from high computa¬ 
tional loads, stemming from their iterative nature. To address the computational load issue, 
alternative procedures were investigated, such as the taut string algorithm of common use 
in the statistics literature [26] . Very recently, elaborating on the dual formulation and thor¬ 
oughly analysing the related Karush-Kuhn-Tucker (KKT) conditions, a fast algorithm has 
been proposed by one of the authors in m to solve the univariate optimization problem ©■ 
Compared to the taut string strategy, it permits to avoid running sum potentially leading 
to overflow values and thus numerical errors. Another specificity concerns its on-the-fly 
behavior that does not require the observation of the whole time sequence before a solution 
can be obtained. On-the-fly algorithms might be of critical interest for real-time monitoring 
such as in medical applications [TjUl]. 

Along another line, extension of the univariate optimization problem o to multivariate 
purposes has been recently investigated in [61 1^132] . The multivariate extension arises very 
naturally in numerous contexts, such as biomedical applications, for which the purpose 
is to extract simultaneous change points from multivariate data, e.g., EEC data [16] . It 
also encompasses denoising of complex-valued data, which can naturally be interpreted as 
bivariate data. Multivariate optimization is known as the group fused Lasso in the statistics 
literature |241l35j . From a Bayesian point of view, elegant solutions have been proposed 
in [T4l|28] and efficient iterative strategies have recently been proposed in [6ll21]. 
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Mutivariate on-the-fly TV. In this context, the present contribution elaborates on m 
to propose an on-the-fly algorithm solving the multivariate extension of O- In Section[2l the 
group fused Lasso problem is first detailed. It is then illustrated that the multivariate pro¬ 
cedure has a non-local behavior as opposed to the local nature of the univariate problem O- 
Consequently, any on-the-fly algorithm solving the multivariate minimization problem will 
only lead to an approximate solution. The KKT conditions resulting from the dual for¬ 
mulation of the multivariate problem are specified in Section From such conditions, a 
fast and on-the-fly, yet approximate, algorithm is derived in Section 3] The performance 
in terms of achieved solution and computational gain are presented in Section [S] A video 
demonstrating the on-the-fly behavior of the algorithm as well as a Matlab toolbox are 
available at http://perso.ens-lyon.fr/jordan.frecon. 

Notations. Let u = {um,k)i<m<M,i<k<N & denote a multivariate signal, where 

for every m € {1,..., M}, Um = {um,k)i<k<N G stands for the m-th component while 
the fc-th values will be shortened as u^, = {um,k)i<m<M G R'^. For every k € {I,..., iV}, 
use will also be made of the following functions: abs(ufc) = (|um,fe|)i<m<M £ R^ and 
sgn(ufc) = {sgn{um.k))j^<^<M ^ 


2 Local vs non-local Nature 


We denote y the multivariate signal of interest. A multivariate extension of (H)) reads: 


x= argmin 




— 1 \ m=l 




( 2 ) 


where A > 0 denotes the regularization parameter and L G R^^ i)xAf denotes the first order 
difference operator, that is, for m £ {1,..., M} and fc G {I,..., V — 1}, 


— Xm,k+1 Xm,k- 


(3) 


Despite formal similarity, there is however a fundamental difference in nature between the 
univariate (M = 1) and multivariate (M > 1) cases: The former is intrinsically local p^[2^ 
while the latter is non-loca^ To make explicit such a notion, we have designed the following 
experiment, whose results are illustrated in Fig. [T] The results associated to the univariate 
(resp. bivariate) case are presented on the right plots (resp. left plots). A univariate signal 
y G R^ with N = 180, consisting of the additive sum of a piece-wise constant signal and 
white Gaussian noise (in gray, in Fig. [T1 right top plot), is considered first. The solution 
of the minimization problem o is displayed in solid red lines in FigH] Also, we search for 
the solution of the minimization problem m applied to two partitions of y, obtained by 
splitting it in half, i.e., y_ = (yfc)i<fc<Ar /2 and y_^_ = {yk)N/2-\-i<k<N- The solutions a;_ and 
x+ of dl]) respectively associated to y_ and y^ are concatenated and displayed with dashed 
blue lines in Fig. [TJ There is strictly no difference between x and the concatenation of X- 
and a;_|_, as reported in FiglT] (bottom right plot), except for the segment that contains the 
concatenation point. The difference around the concatenation point is expected as x makes 
use of an information (the continuity between y_ and y_|_) that is not available to compute 

^In this article, we denote a problem as local if the solution at a given location does not depend on the 
signal located earlier (later) than the previous (next) change-point. 
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Figure 1: Non-local vs. local nature. Left: bivariate TV (upper plots: first component, lower 
plots: second component). Right: univariate TV. Observations y (gray), solution x (red), 
concatenation of solutions x_ and x+ (dashed blue). 

X- and a;+. The fact that there is no difference elsewhere shows the local nature of the 
univariate solution to Problem O- 

This experiment is now repeated for M = 2 (as the simplest representative of M > 1). A 
bivariate signal y = {y 1 , 112 ) S with N = 180, consisting of the additive sum of piece- 

wise constant signals and white Gaussian noises (in gray, in Fig. [U left plots, 1st and 3rd 
lines), is considered. Two partitions, obtained by splitting in half, y_ = {yi,k,y2,k)i<k<N/2 
and y+ = {yi,k,y2,k)N/2+i<k<N are also considered. The corresponding solutions of @, 
applied to y,y_,y+, labeled x, x_ and x+ are obtained by means of the primal-dual 
algorithm proposed in [?] with A = 20. Solutions x_ and x+ of ([2]) respectively associated 
to y_ and y+ are concatenated and displayed with dashed blue lines in Fig. [U while x is 
shown in red. Contrary to the case M = 1, differences between x and concatenated x_ and 
x+, shown in black in bottom plots, differ unambiguously from 0 over the entire support of 
y, clearly showing the non-local nature of x when M > 1. 

In the univariate case (Eq. ([T|)), the local nature of the solution permits to design an 
efficient taut string algorithm, that consists in finding the string of minimal length (i.e., taut 
string) that holds in the tube of radius A around the antiderivative of y. The solution x of 
© is then obtained by computing the derivative of the taut string. An efficient strategy has 
been proposed in m in order to straightforwardly compute x by determining the points of 
contact between the taut string and the tube. Even though this approach can be generalized 
to multivariate signals, the detection of points of contact additionally requires the angle of 
contact between the taut string and the tube. However, this information is non-local and 
thus the on-the-fly minimization problem results in a challenging contact problem which 
can not be solved locally. This interpretation will be further discussed in Section [31 

The non-local nature of the multivariate (M > 1) Problem (O implies that one cannot 
expect to find an exact multivariate on-the-fly algorithm. Therefore, in the present work, 
we will derive an approximate on-the-fly algorithm that provides us a good-quality approxi¬ 
mation of the exact solution to Problem ([2]). A control parameter |Q|, defined in Section 3, 
will control the trade-off between the quality of the approximation and the computational 
cost. 
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3 Multivariate Total Variation Minimization 


3.1 Dual formulation 

Fenchel-Rockafellar dual formulatioiH of (l2|) reads : 


M 


mimmize — 

ueRMx{N-l) 2 


'^\\ym +L*Um\\‘^ subject to (Vfc = V-1}) ||ufc|| < A, (4) 


m—1 


where, for every to S {1,... ,M} and k = {2, ..., N — 2}, 

(Z/ Ujn'jk — 1 '^m,k 

and 

j (Z/ 

(Z/ 1lrri)N —Um,N—l- 

The optimal solutions u G ^mx{n-i) j ^ ^mxn dual problem and of the primal 

problem respectively are related by 


(5) 


( 6 ) 


|(Vto G {1, . . . ,M}) Xm = ym+ L*Um, 

|(Vfc G V- 1}) Ufe G -Aa|| • ||(xfc+i -Xfc). 

From we directly obtain the following necessary and sufficient conditions. 


( 7 ) 


Proposition 3.1 The solutions of the primal problem m and the dual problem o satisfy, 
for every to G {1,..., M}, 

— 2/m d” Z' '^m: (S) 

and, for every k = {1, ... , N — 1}, 


if Xfc=Xfc+i then ||ufc|| < A, 
if Xfe^Xfe+i then Ufc = -A^g±i^ 


(9) 


The first condition corresponds to the configuration where every component keeps the same 
value from location fc to fc + 1. This configuration is illustrated in the bivariate case (M = 2) 
in Fig. [2] (left plot). The second condition models situations where some components of x 
admit change points between locations k and k + 1. An interesting configuration is that 
of non-simultaneous change points as illustrated in Fig. [2] (right plot). In the presence of 
noise, this second situation is rarely encountered. Thus, in the sequel, we will only consider 
simultaneous change points. 


Remark 3.2 Proposition 13.11 for M = 1 leads to the usual KKT conditions associated to 
the minimization problem ©: 


Xfc 

then 

Wfc — +A, 

Xk ^k-\-l 

then 

Uk = -A, 

— ^fc+1 

then 

Uk G [—A, +A] 


( 10 ) 


The on-the-fly univariate TV algorithm proposed in [12] is derived from Conditions cni). 

^Note that, the usual dual formulation and the resulting stationarity conditions would involve u rather 
than —u. The choice made in this article enables us to be consistent with the results obtained in |12| for 
the univariate case. 
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Figure 2: Comparing joint vs disjoint changes in the dual space. Left: location k is suitable 
for a joint negative amplitude change on both components. Right: configuration suitable 
for introducing a negative amplitude change at k on the second component only. 


3.2 Rewriting the KKT conditions 

Contrary to Conditions (11011 . the multivariate conditions derived in Proposition 13.II are not 
directly usable in practice to devise an on-the-fly algorithm because — Xfc is a priori 
unknown at instant k. Therefore, we propose to rewrite the second condition in ([5|) by 
means of auxiliary variables {zk)i<k<N-i such that 

|if Xfc=Xfc+i then ||ufc|| < A, 

[if Xfc 7 ^ Xfe+i then tife = -sign(xfc+i - Xfc) o Zfc, 

with Zfc = where o denotes the Hadamard product. Then, Proposition l3.ll 

+ l Xfcll 

can be reformulated component-wise as follows. 


Proposition 3.3 The solutions of the primal problem m and of the dual problem ® 
satisfy the following necessary and sufficient conditions. There exist nonnegative auxiliary 
variables {zk)i<k<N-i such that, for every m = {1, ..., M} and k = {1,..., N — 1}, 


if 



then 

Um,k — ~\~^m,k'} 


if 

^m,k 


then 

^m,k — ^m,ki 

(12) 

if 

^m,k 


then 

'^m,k ^ 5 ; 



with \\zk\\ = A and = 2/m + L*Um- 

Comparing Eqs. m and (nai highlights the similarity between the necessary conditions 
of the univariate and multivariate minimization problems: Conditions involving A in the 
univariate case involve the auxiliary vector z in the multivariate one. The fact that z differs 
for each pair (m, k) can be interpreted in taut string procedures as the fact that the point of 
contact with the taut string may vary on the tube of radius A. This significantly increases 
the difficulty of deriving an on-the-fly algorithm. 
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3.3 Approximate solution 

If we first assume that z is known and such that, for every k € {1,..., — 1}, ||zfc|| = A, 

the primal problem associated to Conditions (Ha reads 


M 

E 

m—l 


2ll2/m 


N-1 

E 


^m,k I 


(13) 


and can be interpreted as M univariate minimization problems having time-varying regu¬ 
larization parameters (2m)i<m<M- 

The proposed approximation consists in restricting the estimation of z to a predefined 
set Q = chosen such as for every g G {1,..., |Q|}, = (Cm^)i<r„<M € 

satisfies || = A. 

The most naive strategy would consist in solving M univariate minimization problems 
for every |Q| candidate values of z, i.e., find for every m = {1,... ,M} and g = {1,..., |Q|}, 

=argmini||y„-a:^f-hC^)||La;„||i (14) 

Xm Z 

and to devise a method to pick the solution amongst the |Q| candidates. For instance, the 
one that maximizes some quality criterion /, i.e., 

x — with g* = arg max /(x^'^h. (15) 


Although it benefits from parallel on-the-fly implementations, this situation would corre¬ 
spond to a constant estimate z = ^Iat. Therefore, changes in the mean would be 
processed independently on all components and group-sparsity would not be enforced. 

In order to benefit from an on-the-fly implementation and to enforce group-sparsity, we 
propose an algorithmic solution based on a piece-wise constant estimator of z detailed in 
the next section. 


4 Algorithmic solution 

In the following, we first extend the on-the-fly algorithm proposed in m to the multivariate 
case, with z assumed to be known a priori. This strong assumption, unrealistic in pratice, 
permits to describe clearly the behaviour of the multivariate on-the-fly algorithm. Then, 
we will focus on the question of the automated and on-the-fly estimation of z taking its 
values in Q, which consequently introduce a parameter |Q| controlling the quality of the 
approximation. The main steps of the on-the-fly algorithm are summarized in Algorithm [T] 
It is based on the range control of both unknown primal and dual solutions x and u by 
lower and upper bounds updated with the incoming data stream. 

The design of Algorithm [T] results in specifying Rule 1 and Rule 2 allowing respectively 
to detect a change point and to find suitable change-point locations according to Proposi¬ 
tion [TS] 
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Algorithm 1: On-the-fly scheme for multivariate TV 

Data: Multivariate signal y = ..., G . 

Regularization parameter A > 0. 

Starting location fco = 1. 

while ko < N do 

Set fc ■«— fco 

Initialize primal/dual bounds 
while Rule 1 is satisfied do 
Set k <— k + I 
for m ^ 1 to M do 

Update primal/dual bounds 
if Rule 2 is not satisfied then 
|_ Revise the update of primal/dual bounds 

Estimate the change point fcrupt 
Estimate (Xj)fco<j<fcrupt 
_ Set ko i — Abrupt 1 
Result: Solution Xapprox 


4.1 Ideal case with z known 


4.1.1 Lower and upper bounds 


According to Proposition 13.31 the solution of the primal problem, the solution of the dual 
problem and the auxiliary variable have to satisfy, for every k G {0,...,A — 1}, 

{ Ufc+i = yfc+i + Ufc — Xfc+i, 

abs(ufc+i) < Zfc+i, (16) 

l|zfc+i|| = A. 


with uo = uat = 0. Considering the two first conditions, the prolongation condition xa:+i = 
Xfc leads to 


y/c+l ^ 

y/c+1 ^k "^k+l ^k- 


(17) 


Following the solution proposed for the univariate case derived in m, one can check 
that CZl) is satisfied by reasoning on lower and upper bounds of Ufe and Xfc. For every 
k G — 1}, we define the lower and upper bounds of x^, labeled Xfe and x^ respec¬ 

tively, as: 

Xfc < Xfc < Xfc, (18) 

and we set u^, and Ufc as follows 


(VmG M}) 


Um^k 


lLm,k ^m,k 

Um,k if Xm,k 


^m,ki 


(19) 


where Uf. and Ufc appear to be the upper and lower bounds of Xfc respectively, i.e. 


Ufc < Ufc < Ufc, 


as detailed in Appendix 17.11 


( 20 ) 








4.1.2 Updating rules & Rule 1 

The prolongation condition Xfc+i = x^, which has led to (1171) . becomes 


yfe+i > Xfe -Zfc+i -Ufe, 

yfc+i < Xfe + Zfc+i — Ufc. 


(21) 


If the latter condition, labeled as Rule 1, holds, then according to the primal-dual relation, 
we perform the update of the lower and upper bounds at location k + 1 as follows: 


Ufc+i = yfe+i -f Ufe - Xfe, 
Ufc+i = yfc+i -f Ufc - Xfc, 


( 22 ) 


and 


2£A;+1 

^fc+1 ~ ■ 


(23) 


Remark 4.1 Equivalently, one can systematically update primal (resp. dual) bounds ac¬ 
cording to (l22l) (resp. (l23l) l and verify that the following rewriting of the prolongation 
condition (l2T]l holds: 


Ufe+l ^ Zfe+i, 
Ufc+l < -|-Zfe_|_i. 


(24) 


4.1.3 Signal prolongation &: Rule 2 

If Rule 1 (i.e. Condition (1^ or equivalently (l24l) l holds, then the assumption x^+i = x^ 
is valid. However, the upper and lower bounds may have to be updated in order to be 
consistent with Ufe+i G [—z^+i,-l-z^+i]. According to (|20)) . this condition requires to verify 
that the following Rule 2 holds: 


Ufe+l < -fZfe+l, 
Ufc+l > —Zk+l- 


(25) 


For every toG{ 1,...,M}, three configurations can be encountered: 

• When both Conditions (OSl) are satisfied, the bounds are left unchanged. 

• When Ujn.k+i = ^m,k + ym,k+i — > +^m,k+i, then the updating rules specified 

in (l23ll have under-evaluated the bound 


{yj € {ko,...,k + l}) 


(26) 


where ko denotes the last starting location of a new segment. Since is upper- 

bounded by +Zm,k+i and, that for such a value it can be shown (see Appendix 17.21) 


that 


= X. 


■'m,k 


iLm,k+l Zm,k+1 
k — kf) + 1 


(27) 
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we propose the following updates 


(Vj € {fco,...,fc + l}) 

1Lm,k+l ~ +^m,fc+l- 


(28) 


• When Um,k+i < —^m,fc+i, then it results that the upper bound 

l/fn = Xm,j j € {^ 0 ; ■ • ■ f k 1 }) 


(29) 


has been over-evaluated. Similarly, since Um,k+i is lower bounded by —'zm,k+i, we can 
show that the upper bound 


k'm — Xjri.k “t“ 


'am,fc+l “b ^m,k+l 
k — ko + 1 


permits to ensure the consistency of the following updates 


(Vj € {/CQ; ■ ■ ■, k + 1}) 




(30) 


(31) 


4.1.4 Estimate of the change point k^pt 

When Rule 1 does not hold, a change point has to be created. For every m € {1,..., M}, 
we can distinguish three cases: 

• When + ym,k+i - x^^k < -Zm,k+i, then, since is bounded, it 

means that x^ f, is over-evaluated and therefore a negative amplitude change has to 
be introduced on the m-th component in the time index set {kQ,..., k} in order to 
decrease its value. Following Proposition 13.31 and Eq. (l20l) . the set of locations Km 
suitable for a change-point on the m-th component reads: 

Km — {j € {^0) • ■ ■ ! k} I M.m,j ~ (32) 

Such locations correspond to the indexes where the value of the bound ^ has been 
updated in order to be consistent with the condition Um,j G [—'zm,j,Zm,j] (see the 
previous paragraph) 

• When Um,k+i > +^m.fc+i, then a positive amplitude change has to be introduced in 
the m-th component within the time index {/cq, ..., fc}. The set of locations suitable 
for a change-point on the m-th component reads: 

Km — {j ^ {^0; • ■ • ; I — Zmj'^' (33) 

This set of locations corresponds to indexes where the value of the bound Umj was 
updated in order to be consistent with Um,j S [—Zmj , 2m,j]- 

• Else, when component m does satisfy ini), then we set Km = {^o, • • ■; 
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The change-point location fcrupt corresponds to the last location suitable for introducing 
the adequate amplitude change on each component, i.e., 

fcrupt = max j. (34) 

Once the change point location has been specified, we are able to assign a value to 
j.- When a negative amplitude change is detected on the m-th component, we 
set 

('^J s {fcoi • ■ • I fcrupt}) ^m,j = Xm,k+ly (^5) 

in consistence with (IT^ . Similarly, when a positive amplitude change is detected, we set 

(Vj G {fco; • ■ • 5 fcrupt}) 


4.1.5 Starting a new segment 

When a segment has been created, we start the detection of a new segment considering 
fco = fcrupt -f 1 as long as ko < N. 

According to ([7]) and by definition of the bounds, for every fc € {1,..., N} 


Xfc = Yk -u^, + Uk-i, 
xfc = yfc -Ufc -l-Ufe-i- 


(37) 


In particular, for fe = fcp, combining (IT^ . (fTS|) . (fTOl) and (1^ allows us to find the following 
initialization procedure 


Mfco — +Zfeo ! 

Wo = ~^ko I 

—fco “ y^o ~ “f UfeQ_i, 

Xfc-o = yfc'o + Zfco + Ufco-1, 


(38) 


where the value of Ufc^o-i is given according to Proposition 13.31 In addition, according to 
the writing of (ITbl) . uq = 0. 


4.2 Estimation of the auxiliary multivariate vector z 

In order to describe the generic behavior of the multivariate on-the-fly algorithm, we have 
so far assumed z to be known a priori. We now focus on the simultaneous estimation of the 
multivariate vector z and of the multivariate signal x. 

To provide an on-the-fly approximate solution, we propose: 

• to build a piece-wise constant estimator z of z, 

• to only consider amplitude changes jointly on all components m G {1,..., Mj. 
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4.2.1 Piece-wise constant estimator of z 

The proposed estimate is assumed to be constant between each change-point with values 
belonging to the predefined set Q defined in Section 1531 For each candidate value with 
q £ we create upper and lower bounds labeled . They 

are initialized at each new segment location fco and are updated independently according to 
((22)) and ((2^ until the prolongation condition 


nit > 


(39) 


based on ((24L does not hold anymore. In the following, we investigate how to modify the 
algorithm described in Section 14.11 to account for the automated selection of z in Q. The 
resulting algorithm is reported in Algorithmic) 


4.2.2 Estimate of the change point 

For every q G {1,...,|Q|}, we create change points as described in Section 14.1.41 The 
main difference consists in the restriction to simultaneous change points. As detailed after 
ProDOsition l3.ll non-simultaneous changes have a zero probability to occur. The restriction 
to simultaneous change-points will thus not impact the solution. It results that if there exists 
at least one component m G {l,...,M} such that < —Cm^ (resp. u^k+i > 

then 

= {j e {ko, ..., fc} I (40) 

(resp. = {j G {ko, ...,k}\ = -C^^}), (41) 

and, Vm_ ^ m such that -|- < 0, then 

= {j € {fco,..., fc} I = +d«)} (42) 

or, Vm+ ^ m such that +^m+.fc+i ^ 0, then 

= [j e {fco,..., fc} I = +d^)}. (43) 

A bivariate example of these configurations where the second component breaks Condi¬ 
tion (l3^ is provided in Fig. (3) The change-point location fc^upt the assignment of 
on the current segment follow ([34]), (l35|) and (IMj) . 


4.2.3 Estimate of the change point fcpupt 

According to the previous paragraph, the piece-wise estimation procedure leads to several 
possible change-point locations (at most |Q|). Here we select the solution indexed by q* 
with tightest bounds x*-'^ ^ and x*-'^ \ i.e., 


q* G Argmin 
1<9<|C| 



(44) 
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Figure 3: Example of configurations ieading to the detection of a change-point. In this 
example M = 2, = A/v^. Since > C,^\ condition (1391) is violated. 

The left (resp. right) plot displays the conhguration < 0 (resp. + 

IL^k+i — 0) described in Section [4.2.21 

with 

cr = diag((Ti,... ,crM), (45) 

where, for every m G {1,... ,M}, am stands for the standard deviation of The factor 
permits to ensure that every component contributes equally to the criterion (l4^ . When 
the minimizer of (j44p is not unique, we select the index q* yielding the largest In 

other words, we choose the set of auxiliary variables which permits to hold the prolongation 
condition (l3^ as long as possible. 

Therefore, it finally leads to an index q* which permits to estimate fcrupt = ^mpt and, 

(Vj G {fco,---,fcrupt}) Zj = \ (46) 

The starting location for the next segment is then, kg = fcrupt + 1, and the algorithm 
iterates as long as fco < N. 


4.2.4 Starting a new segment 

Let us consider the location kg of a new segment. For every q G {1,..., |Q|}, the initial¬ 
ization step can be recast into 

fulf = +c‘«'. = -C<«>, 

l^lf = yfco - C®' + Ufco-l. = yka + + UA-o-i, 

with uq = 0. 

Remark 4.2 The initialization step (1471) implicitly depends on the estimation of z made 
on the last segment through the term Simulations have shown that dUl) may lead 

to an inconsistent solution x as soon as z has been poorly estimated on a segment. Empiri¬ 
cally, a better approximation of the iterative solution is observed if each segment is treated 
independently, i.e.. 


=y,„ + C(^). 


(48) 
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Algorithm 2: On-the-fly Multivariate TV 

Data: Multivariate signal y = ..., G . 

Regularization parameter A > 0. 

Predefined set Q — • • ■, 

Starting location fco = 1. 

while ko < N do 

for g ■<— 1 to |Q| do 
Set A: fco 

Initialize primal/dual bounds according to (1481) 
while (1391) is satisfied do 
Set k <— k + I 
for m •(— 1 to M do 

Update primal/dual bounds 
if > +Cm^ or < -Cm^ then 

|_ Revise the update of primal/dual bounds 

Estimate a-nd (g) according to Section (14. 2. 21) 

\_ J '‘OSjS'Crupt 

Estimate fcrupt G (^^upt)i<q<|S| according to Section [4.2.31 
Estimate (xj)feo<f<fcrupt and {ij)ko<j<krppt according to (l46l) 

Set ko •«— fcrupt + 1 
Result: Solution x 


5 Performance assessment 

5.1 Experimental setting 

Unless specified otherwise, we consider that data consist of a M-multivariate piece-wise 
constant signal x G (solid black), to which a centered Gaussian noise e is additively 
superimposed: y = x -|- e G 

Signal X is generated as follows. First the length of each segment is drawn according to 
a folded Gaussian distribution A/’(12.5,16.25). Then, for each m G {1,..., M}, the ampli¬ 
tudes of the corresponding changes are drawn independently from a Gaussian distribution 
Ar(2,0.4). 

The exact minimizer of labeled x, is computed by means of the ADMM algorithm 
proposed in [34] . Iterations are stopped when the relative criterion error is lower than 
The proposed solution computed with the predefined set Q is denoted Xapprox.c- 

In a second set of experiments fsee 15.41) . the proposed on-the-fly algorithmic solution will 
be compared to an on-the-fly ADMM solution. 

5.2 Design of Q 

We propose to compare solutions Xapprox.g obtained with two sets Q = • ■ ■, in 

the bivariate case (i.e., M = 2) for N = 10^. For both configurations, we choose 

(Vg G {1,..., |Q|}) = (Acos( 6»,), Asm(6»q)) (49) 

with 9g G [0,7r/2]. The first solution consists to homogeneously cover the £2 ball such 
that, for some some parameter R G N*, 6q = g7r/2^+^ and |Q| = ■ The second 
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Figure 4: Influence of the design of Q. Comparison over 100 realizations of an homogeneous 
covering of the f 2 -ball (blue) against a random covering (red). Two experimental settings 
are considered depending on whether if is one order of magnitude larger than 1/2 (left) 
or not (right). Top; MSE(xa,pprox,s, x) for different set sizes |Q|. Bottom (2nd and 3rd 
lines): distributions of 6q» where q* has been selected by criterion (144 II for |Q| = 127. 

solution draws a set of the same size whose values {0q)i<q<\Q\ follow a uniform distribution 
on [0,7r/2]. 

Two experimental settings are investigated. In the first one, is one order of magnitude 
larger than ^2 (Fig- SI plots) whereas in the second one, both are of the same order of 
magnitude (Fig. IH right plots). 

Estimation performances in terms of mean squared error MSE(xapprox,c, x) = E[^ ||xapprox,Q— 
x|p] (where E stands for the sample mean estimator computed over 100 realizations) are 
reported on the first line. It shows that a random covering of the ^ 2 -ball provides solutions 
as good as the homogeneous covering up to the limit of |Q| small. 

On the 2nd and 3rd lines, the distributions of Oq *, where q* has been selected by criterion 
(133]), are reported for |Q| = 127. These histograms show the impact of the relative amplitude 
of the components on the distribution 6q »: components with same order of magnitude yield 
a symmetric distribution while unbalanced components yield an asymmetric distribution. 

For instance, in Fig. |3|(right plots), it appears more meaningful to draw 6q according to a 
Gaussian distribution than to a uniform distribution. Therefore, if one has some knowledge 
of components amplitudes, this can be incorporated to better design the set Q. This will 
also decrease the computational cost discussed in section 

In the following, we restrict ourselves to a random covering of the ^2 ball. 

5.3 Offline performance 

In this section, we focus on the comparison of offline performance, extended for M = 10, 
for two different signal-to-noise ratios (SNRs), namely 4dB and lOdB. 

Qualitative impact of |Q| on Xapprox.Q- For a single realization of noise, Xapprox.g and x 
are plotted Fig.|5|for A = 29, adjusted to provide the best visual (qualitative) performance. 
Solution Xapprox.g for |Q| = 5 X 10^ (light orange) provides a visually better approximation 
of X (dashed blue) than for |Q| = 10^ (mixed red). 
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Figure 5: Qualitative impact of \Q\ on Xapprox,Q- For visibility, only 3 components out of 
M = 10 are displayed. SNR = 4dB. Xapprox.s for |Q| = 5 x 10^ is more satisfying than 
for |Q| = 10® since it has more discontinuities in common with x. 


Estimation performance Xapprox,s vs. x. The quality of the approximation is further 
quantified Fig. [6] in terms of MSE(xapprox,Q, x) as a function of A for different |Q|. 

It shows that the MSE systematically decreases when \Q\ increases. Eurther, on the 
examples considered here and depending on A, using |Q| > 10^ no longer yields significantly 
improved solutions, thus showing that the selection of |Q| does not require a complicated 
tuning procedure. 

Estimation performance x vs. x and Xapprox.Q vs. x. Let us now compare the 
absolute quality of the solutions against x. 

MSE(x,x) and MSE(xapprox,s, x) for different |Q|, are reported in Eig.jT] MSEs are con¬ 
sistent with the previous paragraph: it shows that increasing |Q| up to a certain value per¬ 
mits to significantly lower the MSE. However, x has a lower estimation error than Xapprox.Q. 


5.4 Online performance 

In this section we focus on the comparison between two online solutions. The first one 
is derived from the proposed on-the-fly algorithm whereas the second one is based on an 
iterative algorithm. 

Comparison is made for different values of A on a signal x € [pM'xAf which a 

Gaussian noise is superimposed such that SNR = 3dB. Performance are provided for M = 2 
and M = 5 components. 

Proposed online solution Xoniine.c- As the time step k increases, Xapprox.Q is only com¬ 
puted up to the last ko and the algorithm has not yet output a solution on {feg -I- 1,..., k}. 
In that sense, the solution is said to be ”on-the-fly”. However, a solution Xoniine.c, providing 
an online approximation of x, can be output up to k by imposing limit conditions at k. 
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Windowed iterative solution Xwin.K- We consider a naive online ADMM version, where 
at each time step k a solution Xwm.if is computed by optimizing over the previous K points. 
The choice of K is of critical importance. On the one hand, if this value is too small, the 
observer may miss amplitude changes in the multivariate data stream. On the other hand, 
if the window size is too large, the computational cost may be too high to handle any on¬ 
line observation. Three window sizes have been investigated, respectively K = 20, 50 and 80. 

Computational cost. Comparisons of median computational costs per incoming sample 
(in seconds), over 10 realizations of noise, are reported Fig. [5] (left plots) as functions of A. 

As expected, we observe that the computational cost does increase along with the size 
of Q. Therefore, |Q| acts as a trade-off between the computational cost and the MSE. 
However, the computational cost of Xoniine.s is still several orders of magnitude lower than 
the one associated to the online ADMM. Interestingly, computational costs are comparable 
for M = 2 (top left plot) and M = 5 (bottom left plot). Note that a warm-up starting 
strategy for online ADMM only reduces by a factor two the computational cost with respect 
to the implementation displayed in Fig. [51 

The computational cost of Xoniine.Q could still be reduced in two ways. First, one could 
design the set Q according to a priori knowledge of components amplitudes (see 15.21) . Sec¬ 
ond, one could also benefit from the separable form of the algorithm and compute solutions 
x^'j) in parallel for every q S {1,..., |Q|}. 

Change-point detection accuracy. The Jaccard index J{cy.,f3) G [0,1] between any ot 
and (3 G [0,1]^ is defined as [T81I22] 


J{cx,(3) 


Efci min(ai,/3i) 


-I- Oii + '^l<i<N Pi 


ai>0,/3i>0 


i=0 




(50) 


It varies from 0, when a D /3 = 0, up to I when a. = (3. The Jaccard index is a demanding 
measure: As an example, if /3 G {0,1}'^ is the truth and if a G {0,1}'^ has correctly 
identified half non-zero values of (3 but has misidentified the other half, then J(q:, (3) = 1/3. 

The Jaccard index is used to measure the similarity between change-point locations 
of X and those obtained during the computation of Xwm.if and Xoniine.Q- To this end, we 
consider the change-point indicator vector r = {ri)i<i<N of x (as well as Twin,if and roniine.c 
respectively associated to x^in./f and Xoniine.s), defined as 


f I, if X has a change-point at location f, 

’■• = I 0, otherwise. 

In order to incorporate a tolerance level on change-point locations, r, Twin,if and roniine.c 
are first convolved with a Gaussian kernel of size 10 with a standard deviation of 3. 

T(rwin,if, J’) and J(roniine,g,'^) are averaged over 10 realizations of noise and reported 
in Fig. [8] (right plots) as functions of A for different set size |Q| and window size K. 

Performance show that J(ronline,Q, J') > J{rwin,K,r) for almost all A and |Q|. There¬ 
fore, Xoniine.c provides a better online detection of change-points of x. It also show that 
J{ronline,Qjr) does not vary significantly with |Q| but slightly decreases with M. Indeed, 
as M increases, the prolongation condition (1551) is more likely to be violated, thus leading 
to more change-points. 
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Figure 6: Estimation performance Xapprox.Q vs x. MSE(xapprox,Q, x) for different |Q|. SNR is 
set to 4dB (resp. lOdB) on left plot (resp. right plot). 
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Figure 7: Estimation performance x vs. x and Xapprox,Q vs. x. MSE(x, x) and 
MSE(xapprox,s,x) for different |Q|. SNR is set to 4dB (resp. lOdB) on left plot (resp. 
right plot). 
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6 Conclusion 

In this contribution, we have developed an algorithm which provides an on-the-fly approxi¬ 
mate solution to the multivariate total variation minimization problem. Besides a thorough 
examination of the KKT conditions, the key-step of the algorithm lies in updating and con¬ 
trolling the range of the upper and lower bounds of the dual solution within a tube of radius 
A. An on-the-fly derivation is achieved by means of an auxiliary vector z, which needs to be 
estimated, providing information on the angle of contact with the tube. The latter estima¬ 
tion strongly affects the quality of the solution and the proposed on-the-fly estimation of z is 
currently achieved by assigning a value chosen within a predefined set Q. It has been shown 
that the size of Q permits to achieve a desired trade-off between the targeted quality of 
the solution and the application-dependent affordable computational cost. In addition, the 
proposed method could also be extended to other penalization norms in the right-hand 
side of (ED, for p > 1. However one would still face the issue of estimating z which would 
have to lie within a ^p ball of radius A. Under current interest is the investigation of how 
to estimate z in the case where the assumption of piece-wise constant behavior is a priori 
relaxed. 


7 Appendix 

7.1 Proof of Equation (1^ 

According to the primal-dual relation (O, for every to € {1,..., M} and k £ {1,...,A^ — 1}, 
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Figure 8: Online Performance. The proposed solution Xoniine.s is displayed in solid line while 
the online ADMM solution Xwin,K is displayed in dashed line. Performance for M = 2 
(resp. M = 5) are illustrated top (resp. bottom). Left: median computational cost per 
incoming sample (in seconds). Right: J(rwin,K,r) and J(T’oniine,Q, r) for different values 
of \Q\ and K. 


and by definition of the lower and upper bounds of Xra,k and Urn,k-, we have 

“ ym,k + Um,k -1 ~ 3Lm,ki (^ 3 ) 

Um,k — ym,k T Um,k—1 Xm,k- 
By subtracting (l5^ from (l5^ we obtain 

Um,k ~ ILm^k ~ 3Lm,k ~ ^m,k (55) 

and, according to m. ~ Urn fc ^ 0- The arguments are similar for proving that 

Um,k ^ Um,k- 

7.2 Proof of Equation (|27]) 

For every m G {1,..., M} and k G {ko,... ,N — 2}, if 

M.m,k+1 ~ 'ikm.,k “f 2/m,fe+l ~ ^m,fc ^ +-2^m,fc+li (56) 

then updating rules of j,, specified in (l23ll . have under-evaluated its value v_^. To modify 
the lower bounds (a:^ j)ko<j<k+i, on the one hand, we consider the cumulative sum of the 
observations which, according to 0, leads to 

fe-i-i 

^ ^ ym,j — ttm.fc-t-l Um,kQ {k k{) \)Xm,k+l^ (57) 

j=ko + l 

and thus, if = +^rn,k+i, would lead to 

fc+i 

'y ( ymj ~ ^m,k+l '^m,ko {k k^ l)i£yy^, (58) 

j=ko + l 


19 



















by definition of other hand, the updating rules (f^ and (P?]) have led 

to 

fc+i 

Mm.fe+l = U^,fco + X! 2/m.i - (fc - fco + (59) 

i—^0+1 

The combinaison of (1551) and (1551) leads to 


= X. 


■m,k 


+ 


Mm,kg ”1” M.m,k+1 Zm,k+1 + Um,ko 

k — kn 1 


(60) 


Because have been under-evaluated and by definition Um,ko < lLm,ko^ propose 

the following value 


iLm,k+l Zm,k+1 

k — ko + 1 


(61) 


in order to adjust the lower bounds, i.e., 


(Vj e 1}) Xrn,j=Em- (62) 

In addition, as a result of G [—'Zm,k+i, +Zm,k+i] and according to the inequality (I20L 

we set 

Hm.fc-l-l “ ~\~Zm,k+l- (63) 
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